Learning part of speech disambiguation rules using Inductive Logic Programming
Abstract
A pilot study on inducing rules for part of speech tagging of unrestricted Swedish text is reported. Using the Progol machine-learning system, Constraint Grammar-inspired rules were learnt from the part of speech tagged Stockholm-Umeå Corpus. Several thousand disambiguation rules discarding faulty readings of ambiguously tagged words were induced. When tested on unseen data, 97% of the words retained the correct reading after tagging. However, ambiguities remained in the output after the tagging rules were applied: on average, 1.15 tags per word.
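The two figures quoted in the abstract (the share of words that keep their correct reading, and the average number of tags left per word) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the tag names, the single hand-written discard rule, and the toy sentence are hypothetical, and the rule is not one of the rules induced by Progol in the study.

```python
from typing import Callable, List, Set, Tuple

# A token is (word, candidate tags, gold tag). The tag names are hypothetical.
Token = Tuple[str, Set[str], str]
Rule = Callable[[Set[str], Set[str]], Set[str]]


def discard_verb_after_determiner(prev_tags: Set[str], tags: Set[str]) -> Set[str]:
    """Hypothetical discard rule: drop the verb reading when the previous
    word is unambiguously a determiner. A reading is only removed while
    at least one other reading remains."""
    if prev_tags == {"DT"} and "VB" in tags and len(tags) > 1:
        return tags - {"VB"}
    return tags


def apply_rules(sentence: List[Token], rules: List[Rule]) -> List[Token]:
    """Apply every rule to each token, left to right, using the already
    reduced tag set of the previous token as context."""
    output: List[Token] = []
    prev_tags: Set[str] = set()
    for word, tags, gold in sentence:
        current = set(tags)
        for rule in rules:
            current = rule(prev_tags, current)
        output.append((word, current, gold))
        prev_tags = current
    return output


def evaluate(tagged: List[Token]) -> Tuple[float, float]:
    """Return (fraction of words whose gold tag survived,
    average number of remaining tags per word)."""
    n = len(tagged)
    retained = sum(gold in tags for _, tags, gold in tagged) / n
    tags_per_word = sum(len(tags) for _, tags, _ in tagged) / n
    return retained, tags_per_word


# Toy input: "walk" and "helps" are ambiguous between noun and verb.
sentence: List[Token] = [
    ("the", {"DT"}, "DT"),
    ("walk", {"NN", "VB"}, "NN"),
    ("helps", {"NN", "VB"}, "VB"),
]

result = apply_rules(sentence, [discard_verb_after_determiner])
retained, tags_per_word = evaluate(result)
print(f"correct reading retained: {retained:.2f}, tags/word: {tags_per_word:.2f}")
```

Because rules of this kind only discard readings and never select one, a word can stay ambiguous after all rules have fired, which is why the abstract reports a tags-per-word figure above 1 alongside the 97% figure.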
Similar resources
Learning Constraint Grammar-style Disambiguation Rules using Inductive Logic Programming
This paper reports a pilot study, in which Constraint Grammar inspired rules were learnt using the Progol machine-learning system. Rules discarding faulty readings of ambiguously tagged words were learnt for the part of speech tags of the Stockholm-Umeå Corpus. Several thousand disambiguation rules were induced. When tested on unseen data, 98% of the words retained the correct reading after tag...
NP chunking using ILP
This is to report the results of approaching the problem of NP chunking using Inductive Logic Programming techniques. The problem, as defined in (Ramshaw and Marcus, 1995), is the machine learning of rules that recognise non-recursive, base NPs in text annotated with part-of-speech tags, by tagging each word as being 'inside' or 'outside' an NP. (Consecutive NPs are appropriately treated.) The same ...
Unsupervised Learning of Disambiguation Rules for Part of Speech Tagging
In this paper we describe an unsupervised learning algorithm for automatically training a rule-based part of speech tagger without using a manually tagged corpus. We compare this algorithm to the Baum-Welch algorithm, used for unsupervised training of stochastic taggers. Next, we show a method for combining unsupervised and supervised rule-based training algorithms to create a highly accurate t...
Learning Expressive Models for Word Sense Disambiguation
We present a novel approach to the word sense disambiguation problem which makes use of corpus-based evidence combined with background knowledge. Employing an inductive logic programming algorithm, the approach generates expressive disambiguation rules which exploit several knowledge sources and can also model relations between them. The approach is evaluated in two tasks: identification of the...
Mining Association Rules in Multiple Relations
The application of algorithms for efficiently generating association rules is so far restricted to cases where information is put together in a single relation. We describe how this restriction can be overcome through the combination of the available algorithms with standard techniques from the field of inductive logic programming. We present the system Warmr, which extends Apriori [2] to mine assoc...
Publication date: 2007